Matrix Entry-wise Sampling: Simple is Best
Authors
Abstract
Sparsifying matrices is a ubiquitous operation in large-scale machine learning, data mining, and signal processing. More formally, given a large matrix A, we aim to find another matrix B such that $\|A - B\| \le \varepsilon$, with B being significantly sparser than A. Using B as a surrogate for A is more efficient and often provides provably good approximations for many tasks. In this paper, we suggest an element-wise sampling scheme for producing B. We prove it is superior to previously suggested schemes using a relatively new matrix-valued version of the Bernstein inequality, which is known to be tight up to logarithmic factors. Moreover, the sampling scheme can be executed in the streaming model, where single matrix non-zeros are presented to the algorithm in an arbitrary order. We support our theoretical findings with experimental results that corroborate our claims.
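As a rough illustration of element-wise sampling in general, the sketch below builds an unbiased sparse estimate B of A from s sampled entries. The abstract does not state the sampling distribution, so the choice here (probabilities proportional to |A_ij|) is an assumption made only for the example and is not claimed to be the paper's scheme. The function name `sample_entries` and its parameters are likewise hypothetical.

```python
import numpy as np

def sample_entries(A, s, rng=None):
    """Return an unbiased estimate B of A built from s sampled entries.

    Assumption (not from the paper): entries are drawn i.i.d. with
    probability p_ij = |A_ij| / sum_kl |A_kl|.
    """
    rng = np.random.default_rng(rng)
    A = np.asarray(A, dtype=float)
    flat = A.ravel()
    weights = np.abs(flat)
    total = weights.sum()
    if total == 0:
        return np.zeros_like(A)
    p = weights / total

    # Draw s flat indices with replacement according to p.
    idx = rng.choice(flat.size, size=s, replace=True, p=p)

    # Each draw contributes A_ij / (s * p_ij), so that E[B] = A.
    B = np.zeros(flat.size)
    np.add.at(B, idx, flat[idx] / (s * p[idx]))
    return B.reshape(A.shape)

# Usage: compare the sparsity of B with A and measure the spectral-norm error.
A = np.random.default_rng(0).standard_normal((50, 50))
B = sample_entries(A, s=500, rng=1)
error = np.linalg.norm(A - B, 2)   # spectral norm of A - B
nonzeros = np.count_nonzero(B)     # at most s, since only sampled entries are set
```

Because B has at most s non-zero entries, increasing s trades sparsity for a smaller error; how quickly the error shrinks depends on the sampling distribution, which is what the paper's analysis addresses.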
Related resources
A Very Fast Method for Clustering Big Text Datasets
Large-scale text datasets have long eluded a family of particularly elegant and effective clustering methods that exploits the power of pair-wise similarities between data points due to the prohibitive cost, time- and space-wise, in operating on a similarity matrix, where the state-of-the-art is at best quadratic in time and in space. We present an extremely fast and simple method also using the ...
A Block-Wise random sampling approach: Compressed sensing problem
The focus of this paper is to consider the compressed sensing problem. It is stated that the compressed sensing theory, under certain conditions, helps relax the Nyquist sampling theory and takes smaller samples. One of the important tasks in this theory is to carefully design the measurement matrix (sampling operator). Most existing methods in the literature attempt to optimize a randomly initiali...
The Leave-one-out Approach for Matrix Completion: Primal and Dual Analysis
In this paper, we introduce a powerful technique, Leave-One-Out, to the analysis of low-rank matrix completion problems. Using this technique, we develop a general approach for obtaining fine-grained, entry-wise bounds on iterative stochastic procedures. We demonstrate the power of this approach in analyzing two of the most important algorithms for matrix completion: the non-convex approach base...
On the spectra of reduced distance matrix of the generalized Bethe trees
Let G be a simple connected graph and {v_1, v_2, ..., v_k} be the set of pendent vertices (vertices of degree one) of G. The reduced distance matrix of G is a square matrix whose (i,j)-entry is the topological distance between v_i and v_j in G. In this paper, we compute the spectrum of the reduced distance matrix of the generalized Bethe trees.
A note on element-wise matrix sparsification via a matrix-valued Bernstein inequality
Given a matrix A ∈ R, we present a simple, element-wise sparsification algorithm that zeroes out all sufficiently small elements of A and then retains some of the remaining elements with probabilities proportional to the square of their magnitudes. We analyze the approximation accuracy of the proposed algorithm using a recent, elegant non-commutative Bernstein inequality, and compare our bounds...